Statistical Properties of Statistical Ensembles of Stock Returns 
We select n stocks traded in the New York Stock Exchange
and we form a statistical ensemble of daily stock returns for
each of the k trading days of our database from the stock
price time series. We analyze each ensemble of stock returns
by extracting its first four central moments. We observe that
these moments fluctuate in time and are themselves stochastic
processes. We characterize the statistical properties of the
central moments by investigating their probability density
function and temporal correlation properties.

In recent years a large number of statistical analyses
of the dynamics of the price time series of a single stock
have been performed by physicists interested in the modeling
of financial markets [1]. In this paper we present the results
of an empirical analysis that takes a different approach. We
investigate the statistical properties of the daily returns of
n selected stocks simultaneously traded in a financial market.
There are two main motivations for this kind of analysis. From
a fundamental point of view, our analysis may help in
understanding collective behaviors in stock markets. These
behaviors become of great importance during market crashes and
in times of financial turmoil, when stocks in the market become
more interlinked. From an applied point of view, our analysis
may be useful in the management of large portfolios of stocks.

The investigated market is the New York Stock Exchange
(NYSE) during the 12-year period January 1987 to April 1999,
comprising 3113 trading days. We select four ensembles of n
stocks. The number of stocks in each ensemble is not always
constant during the investigated period because the number of
stocks traded in the NYSE increases rapidly, from approximately
1100 in 1987 to approximately 2800 in 1999. Old stocks disappear
and new ones start to be traded in the market. Moreover, for
each ensemble we consider only the stocks traded in the NYSE
and we exclude those traded in the NASDAQ or AMEX markets.
Hence n is constant or slowly increasing with time in the
selected ensembles of stocks: (i) the n = 30 stocks used to
compute the Dow Jones Industrial Average (DJIA30); (ii) the
n > 86 stocks used to compute the Standard & Poor's 100 Index
(SP100); (iii) the n > 313 stocks used to compute the Standard
& Poor's 500 Index (SP500); and (iv) all the n > 1100 stocks
traded in the NYSE (NYSE). The variable investigated in our
analysis is the daily return, which is defined as
where Y_i(t) is the closing price of the i-th stock at day
t. For each day we consider n returns, where n is about 30, 90,
420 or 2100 depending on the chosen set. A first analysis
concerns the distribution of returns at a given day t. We
observe that on many days the central part of this distribution
is approximated by a Laplace (double exponential) distribution.
Significant changes in the shape and scale are frequently
observed, especially in times of financial turmoil. The Laplace
distribution has recently been considered in economic analyses
of the growth dynamics of companies. In order to characterize
the return distribution at day t more quantitatively, we
determine the first four central moments for each of the 3113
trading days of the 4 sets of stocks considered. Specifically,
we consider the mean, the standard deviation, the skewness and
the kurtosis defined as
The mean μ(t) gives a measure of the general trend of the
market at day t. The standard deviation σ(t) controls the width
of the distribution and gives a measure of the variety of
behaviors observed in the financial market. A large value of
σ(t) indicates that different companies show very different
behaviors at day t. The skewness ρ(t) and the kurtosis κ(t) are
scale-free parameters, whose values depend on the shape of the
distribution but not on its scale. ρ(t) describes the asymmetry
of the distribution with respect to μ(t). A positive value of
ρ(t) indicates that a few companies make large profits and many
companies have small losses at day t with respect to the mean.
A negative value of ρ(t) corresponds to the complementary case.
Finally, the kurtosis κ(t) gives a measure of the distance of
the distribution from a Gaussian distribution.
In our analysis we have discarded returns with absolute
value |R_i(t)| > 0.5, because some of these returns might
be attributed to errors in the database. Such errors would
strongly affect the estimation of the higher moments, because
statistical estimates of moments of a distribution higher than
the second are increasingly sensitive to extreme values.

FIG. 1. Probability density function of the standard deviation
(variety) σ(t) for the four ensembles considered: DJIA30
(circle), SP100 (square), SP500 (diamond) and NYSE (triangle).
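
The procedure described above can be sketched in Python as
follows. This is a minimal illustration, not the code used for
the analysis. It assumes the simple relative return
(Y_i(t) - Y_i(t-1)) / Y_i(t-1), population (1/n) moments, and a
kurtosis that is not reduced by 3 (so a Gaussian gives 3). These
conventions and the function names daily_returns and
ensemble_moments are assumptions introduced for the example.

```python
import numpy as np

def daily_returns(prices):
    """Daily returns from a (days x stocks) array of closing prices.

    Assumption: simple relative return R_i(t) = Y_i(t) / Y_i(t-1) - 1.
    """
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

def ensemble_moments(returns, max_abs_return=0.5):
    """Per-day mean, standard deviation, skewness and kurtosis across stocks.

    Returns with |R| > max_abs_return are discarded (treated as missing),
    following the filter described in the text. NaN entries are ignored.
    """
    r = np.asarray(returns, dtype=float)
    r = np.where(np.abs(r) > max_abs_return, np.nan, r)
    mu = np.nanmean(r, axis=1)                    # mean of the ensemble
    dev = r - mu[:, None]                         # deviations from the mean
    sigma = np.sqrt(np.nanmean(dev**2, axis=1))   # standard deviation (variety)
    rho = np.nanmean(dev**3, axis=1) / sigma**3   # skewness (scale-free)
    kappa = np.nanmean(dev**4, axis=1) / sigma**4 # kurtosis (3 for a Gaussian)
    return mu, sigma, rho, kappa
```

Given a (days x stocks) price array, calling
ensemble_moments(daily_returns(prices)) yields four time series,
one value per trading day, which can then be studied as
stochastic processes in their own right.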

We obtain the values of the four moments for each trading
day. These quantities are not constant but fluctuate in time.
By observing the time evolution of μ(t), we note that on
several trading days big jumps of the mean are observed. These
findings can be evaluated more quantitatively by investigating
the empirical probability density function (PDF) of the μ(t)
time series. This PDF is also approximated by a Laplace
distribution.
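
One way to compare an empirical PDF with a Laplace shape is to
estimate the two Laplace parameters by maximum likelihood: the
location is the sample median and the scale is the mean absolute
deviation from the median. The sketch below illustrates this
under the assumption that such a fit is wanted; the text does
not state how the Laplace approximation was assessed, and the
names fit_laplace and laplace_pdf are hypothetical.

```python
import numpy as np

def fit_laplace(x):
    """Maximum-likelihood Laplace parameters (location, scale) of a sample."""
    x = np.asarray(x, dtype=float)
    x = x[np.isfinite(x)]               # drop missing values
    loc = np.median(x)                  # ML estimate of the location
    scale = np.mean(np.abs(x - loc))    # ML estimate of the scale
    return loc, scale

def laplace_pdf(x, loc, scale):
    """Laplace density exp(-|x - loc| / scale) / (2 * scale)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-np.abs(x - loc) / scale) / (2.0 * scale)
```

The fitted density can be plotted on a semi-logarithmic scale
against a histogram of the data, where a Laplace distribution
appears as two straight lines meeting at the location.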

In Fig. 1 we show the PDFs of the variety σ(t). We observe
that, for each ensemble considered, the distribution is roughly
log-normal, with an approximately power-law tail at the higher
values. We note that the distributions do not coincide for
different ensembles. In particular, the mean of σ(t) increases
as the number of stocks considered increases and as the
(average) capitalization of the stocks decreases. Indeed, the
stocks which compose the DJIA30 set have large capitalization
and small volatility. On the other hand, the NYSE set contains
companies with both small and large capitalization and with
different levels of volatility. The NYSE set is more
heterogeneous than the DJIA30 set and this is reflected in a
greater value of the variety σ(t).

The higher moments are extremely sensitive to rare events.
The PDF of the skewness is non-Gaussian with fat tails and is
slightly asymmetric around the value ρ = 0 (especially for the
NYSE set). Positive values of the skewness are slightly more
probable than negative ones. This implies that days on which a
few companies achieve large gains and many companies have small
losses with respect to the mean are slightly more frequent than
the complementary case. The PDFs of the kurtosis, P(κ), are
approximately characterized by a power-law tail κ^(-γ) at the
higher values of κ for the four ensembles of stocks. The
exponent γ of the power-law region is approximately equal to 2
and becomes slightly greater moving from the NYSE to the DJIA30
set. In summary, the first four central moments of the
distribution of daily returns are distributed in a non-trivial
way.
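
A power-law exponent of a PDF tail can be estimated in several
ways. The sketch below uses the standard maximum-likelihood
estimator for a continuous power law above a chosen threshold.
This is a swapped-in technique for illustration: the text does
not say how the exponent of about 2 was obtained, and the
function name and the threshold x_min are assumptions of the
example.

```python
import numpy as np

def power_law_tail_exponent(x, x_min):
    """ML estimate of the PDF exponent for p(x) ~ x**(-gamma), x >= x_min.

    Uses gamma = 1 + n / sum(log(x_i / x_min)) over the n tail points.
    x_min must be positive and at least one tail point must exceed it.
    Returns (gamma, standard_error).
    """
    x = np.asarray(x, dtype=float)
    tail = x[np.isfinite(x) & (x >= x_min)]       # keep only the tail region
    log_ratio_sum = np.sum(np.log(tail / x_min))
    gamma = 1.0 + tail.size / log_ratio_sum
    std_err = (gamma - 1.0) / np.sqrt(tail.size)  # asymptotic standard error
    return gamma, std_err
```

The estimate depends on the choice of x_min, so in practice it
is worth checking that the result is stable over a range of
thresholds.
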

FIG. 2. Log-log plot of the autocorrelation function R(τ)
of the variety σ(t) as a function of the time lag τ (in trading
days) for the four ensembles considered: DJIA30 (circle), SP100
(square), SP500 (diamond) and NYSE (triangle). In the inset we
show the time evolution of the variety for the NYSE ensemble.
On 19 October 1987 the variety reached the value σ = 0.096,
which is off the scale of the inset.

In order to better characterize the temporal evolution of
μ(t) and σ(t), we investigate their memory properties. To this
end we calculate their autocorrelation functions. We find that
the mean is delta correlated, whereas a different behavior is
observed for σ(t). Fig. 2 shows the autocorrelation function of
the variety σ(t) for the four ensembles considered in a log-log
plot. We observe that the autocorrelation function of the
empirical data is well approximated by a power-law function
R(τ) ∝ τ^(-δ). By performing a best fit of R(τ) with a maximum
time lag of 50 trading days, we determine δ as 0.27 (DJIA30),
0.25 (SP100), 0.26 (SP500) and 0.20 (NYSE). These results
indicate that a long-time memory is present in the market for
the variety σ(t). We also observe a power-law autocorrelation
function for the quantity |μ(t)|. We mention that the behaviors
of the autocorrelation functions of μ(t) and σ(t) are
consistent with the results of our analysis of their Fourier
transforms. We observe that μ(t) has a white-noise power
spectrum, whereas the variety σ(t) has a power-law power
spectrum.
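
The autocorrelation analysis above can be illustrated with a
short Python sketch. It computes the normalized sample
autocorrelation up to a maximum lag of 50 and fits a straight
line in log-log coordinates to estimate the decay exponent. The
fitting procedure actually used is not specified in the text, so
the least-squares fit, the handling of non-positive values and
the function names are assumptions of this example.

```python
import numpy as np

def autocorrelation(x, max_lag=50):
    """Normalized sample autocorrelation R(lag) for lag = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                     # fluctuations around the mean
    var = np.mean(d**2)
    lags = np.arange(1, max_lag + 1)
    # Average the lagged products over the overlapping part of the series.
    acf = np.array([np.mean(d[:-lag] * d[lag:]) for lag in lags]) / var
    return lags, acf

def fit_power_law_decay(lags, acf):
    """Fit R(lag) ~ A * lag**(-delta) by least squares in log-log space.

    Only positive autocorrelation values are used, since the logarithm
    is undefined otherwise. Returns (delta, A).
    """
    lags = np.asarray(lags, dtype=float)
    acf = np.asarray(acf, dtype=float)
    keep = acf > 0
    slope, intercept = np.polyfit(np.log(lags[keep]), np.log(acf[keep]), 1)
    return -slope, np.exp(intercept)
```

Applied to a variety time series, fit_power_law_decay(
*autocorrelation(sigma, 50)) returns an estimate of the exponent
that can be compared with the values quoted above. A
delta-correlated series such as the mean gives autocorrelation
values scattered around zero, for which this fit is not
meaningful.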

In conclusion, we have introduced the concept of the variety
of a statistical ensemble of stocks traded in a financial
market. The statistical properties of the variety are
non-trivial and are characterized by a non-Gaussian PDF and by
a long-term memory.